A Replication Study of the Top Performing Systems in SemEval Twitter Sentiment Analysis

Authors

  • Efstratios Sygkounas
  • Giuseppe Rizzo
  • Raphaël Troncy
Abstract

We performed a thorough replication study of the top-performing systems in the yearly SemEval Twitter Sentiment Analysis task. We highlight and discuss differences between the officially published results of those systems and the results we were able to reproduce. Building on what we learned from studying these systems, we also propose SentiME, an ensemble system composed of five state-of-the-art sentiment classifiers. SentiME trains the individual classifiers using the Bootstrap Aggregating (bagging) algorithm. The classification results are then aggregated using a linear function that averages the classification distributions of the different classifiers. SentiME was also evaluated on the SemEval2015 test set after being trained on the SemEval2015 training set. We show that SentiME would outperform the best-ranked system of the challenge.
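The two mechanisms named in the abstract, bootstrap resampling of the training data and linear averaging of the classifiers' output distributions, can be illustrated with a minimal sketch. This is not the SentiME implementation: the label set, the equal weighting, the sample size, and all function names below are assumptions made for illustration only.

```python
import random
from typing import List, Sequence

# Assumed three-way polarity label set; the abstract does not list the labels.
LABELS = ("positive", "neutral", "negative")


def bootstrap_sample(data: Sequence, size: int, rng: random.Random) -> list:
    """Draw `size` items from `data` uniformly at random, with replacement."""
    return [data[rng.randrange(len(data))] for _ in range(size)]


def average_distributions(distributions: List[List[float]]) -> List[float]:
    """Unweighted mean of the per-classifier class-probability vectors.

    Equal weights are an assumption; the abstract only says "a linear
    function that averages the classification distributions".
    """
    n = len(distributions)
    n_classes = len(distributions[0])
    return [sum(d[i] for d in distributions) / n for i in range(n_classes)]


def predict_label(distributions: List[List[float]]) -> str:
    """Return the label with the highest averaged probability."""
    avg = average_distributions(distributions)
    best = max(range(len(avg)), key=avg.__getitem__)
    return LABELS[best]


if __name__ == "__main__":
    rng = random.Random(0)
    tweets = ["t1", "t2", "t3", "t4", "t5"]  # placeholder training items
    # Each hypothetical classifier would be trained on its own resample.
    resample = bootstrap_sample(tweets, size=len(tweets), rng=rng)
    print(resample)

    # Made-up output distributions from three hypothetical classifiers
    # for a single tweet, ordered as in LABELS.
    dists = [[0.6, 0.3, 0.1], [0.2, 0.5, 0.3], [0.5, 0.4, 0.1]]
    print(average_distributions(dists), predict_label(dists))
```

In this toy input, two of the three classifiers favour "positive" and the averaged distribution does as well. Averaging distributions rather than taking a majority vote lets a confident classifier outweigh several uncertain ones.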


Similar resources

Sentibase: Sentiment Analysis in Twitter on a Budget

As in SemEval 2013 and 2014, the Sentiment Analysis in Twitter task was part of this year's SemEval and attracted an unprecedented number of participants. The task comprises four subtasks. We participated in subtask 2, message polarity classification. Although we rank a few notches below the top system, we present a very simple yet effective approach to handle this problem th...


INESC-ID: A Regression Model for Large Scale Twitter Sentiment Lexicon Induction

We present the approach followed by INESC-ID in the SemEval 2015 Twitter Sentiment Analysis challenge, subtask E. The goal was to determine the strength of the association of Twitter terms with positive sentiment. Using two labeled lexicons, we trained a regression model to predict the sentiment polarity and intensity of words and phrases. Terms were represented as word embeddings induced in an ...


iLab-Edinburgh at SemEval-2016 Task 7: A Hybrid Approach for Determining Sentiment Intensity of Arabic Twitter Phrases

This paper describes the iLab-Edinburgh Sentiment Analysis system, winner of the Arabic Twitter Task 7 in SemEval-2016. The system employs a hybrid approach of supervised learning and rule-based methods to predict a sentiment intensity (SI) score for a given Arabic Twitter phrase. First, the supervised method uses an ensemble of trained linear regression models to produce an initial SI score fo...


SemEval-2015 Task 11: Sentiment Analysis of Figurative Language in Twitter

This report summarizes the objectives and evaluation of the SemEval 2015 task on the sentiment analysis of figurative language on Twitter (Task 11). This is the first sentiment analysis task wholly dedicated to analyzing figurative language on Twitter. Specifically, three broad classes of figurative language are considered: irony, sarcasm and metaphor. Gold standard sets of 8000 training tweets...


SAP-RI: Twitter Sentiment Analysis in Two Days

We describe the submission of the SAP Research & Innovation team to the SemEval 2014 Task 9: Sentiment Analysis in Twitter. We challenged ourselves to develop a competitive sentiment analysis system within a very limited time frame. Our submission was developed in less than two days and achieved an F1 score of 77.26% for contextual polarity disambiguation and 55.47% for message polarity classif...




Publication date: 2016